Genomic characterization and virulence of Streptococcus suis serotype 4 clonal complex 94 recovered from human and swine samples

Streptococcus suis is a zoonotic pathogen that causes invasive infections in humans and pigs. Herein, we performed genomic analysis of seven S. suis serotype 4 strains belonging to clonal complex (CC) 94 that were recovered from a human patient or from diseased and clinically healthy pigs. Genomic exploration and comparisons, as well as in vitro cytotoxicity tests, indicated that S. suis CC94 serotype 4 strains are potentially virulent. Genomic analysis revealed that all seven strains clustered within minimum core genome group 3 (MCG-3) and had a high number of virulence-associated genes similar to those of virulent serotype 2 strains. Cytotoxicity assays showed that both the human lung adenocarcinoma cell line and HeLa cells rapidly lost viability following incubation for 4 h with the strains at a concentration of 10⁶ bacterial cells. The human serotype 4 strain (ID36054) decreased cell viability profoundly and similarly to the control serotype 2 strain P1/7. In addition, strain ST1689 (ID34572), isolated from a clinically healthy pig, presented similar behaviour in an adenocarcinoma cell line and HeLa cells. The antimicrobial resistance genes tet(O) and ermB that confer resistance to tetracyclines, macrolides, and lincosamides were commonly found in the strains. However, aminoglycoside and streptothricin resistance genes were found only in certain strains in this study. Our results indicate that S. suis CC94 serotype 4 strains are potentially pathogenic and virulent and should be monitored.

Introduction

Susceptibility to penicillin was determined by the minimum inhibitory concentration (MIC) following the M100 (32nd edn) Clinical and Laboratory Standards Institute (CLSI-M100) guidelines [24] using the Liofilchem® MIC Test Strip according to the manufacturer's instructions (Liofilchem, Italy). We followed the standards defined in the 2022 CLSI-M100 guidelines to classify the penicillin susceptibility of the strains (MIC ≤0.12 μg/ml = susceptible; MIC 0.25-2 μg/ml = intermediate; MIC ≥4 μg/ml = resistant) [24]. Susceptibility to other antimicrobials, namely ceftriaxone, cefepime, azithromycin, erythromycin, tetracycline, clindamycin, levofloxacin, and chloramphenicol, was determined using the disk diffusion technique following the 2022 CLSI-M100 guidelines [24]. Since there are currently no breakpoints recommended for S. suis, those for viridans group streptococci were used, as defined in the guidelines [24]. Because the CLSI guidelines for viridans group streptococci do not provide zone diameter breakpoints for penicillin, susceptibility to this antibiotic was assessed by MIC testing [24]. Therefore, we performed the MIC test strip procedure for penicillin and disk diffusion for the other antibiotics. Streptococcus pneumoniae strain ATCC 49619 was used as a control.
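The penicillin breakpoints quoted above can be expressed as a small classification routine. The following is a minimal sketch, not the procedure used in the study: the function names are hypothetical, and the step that rounds a test-strip reading up to the next two-fold dilution before interpretation is an assumption (a common convention for gradient strips), not something stated in the text.

```python
import math


def round_up_to_twofold(mic: float) -> float:
    """Round an MIC (ug/ml) up to the next two-fold dilution (a power of 2).

    Assumption: half-step strip readings (e.g. 0.19 or 3) are rounded up
    before interpretation. The nominal value 0.12 maps to 0.125.
    """
    if mic <= 0:
        raise ValueError("MIC must be positive")
    return 2.0 ** math.ceil(math.log2(mic))


def classify_penicillin(mic: float) -> str:
    """Apply the breakpoints quoted in the text for viridans group streptococci:
    <=0.12 susceptible, 0.25-2 intermediate, >=4 resistant (ug/ml)."""
    rounded = round_up_to_twofold(mic)
    if rounded <= 0.125:  # written as 0.12 in the breakpoint tables
        return "susceptible"
    if rounded >= 4:
        return "resistant"
    return "intermediate"
```

For example, `classify_penicillin(0.19)` is interpreted at 0.25 μg/ml and falls in the intermediate category under these assumptions.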

Whole-genome sequencing
The genomes of all seven strains were sequenced using Illumina technology. Additionally, the genomes of three strains (ID36054, ID34572, and TRG6) were sequenced using Oxford Nanopore Technologies (ONT) as described elsewhere [25]. Briefly, Illumina sequencing libraries were generated using the NEBNext Ultra II DNA Library Prep Kit for Illumina (New England Biolabs, UK) following the manufacturer's recommendations. The genomic DNA was randomly fragmented to a size of 350 bp, and the fragments were A-tailed and ligated with the adapter. Libraries were sequenced as paired-end reads (150 + 150 bp) using a HiSeq 2500 instrument. The sequencing adapters were trimmed using Fastp v0.19.5 (https://github.com/OpenGene/fastp), and the quality of clean reads was determined using FastQC v0.11.8 (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/).
Library preparation for ONT sequencing followed the rapid barcoding DNA sequencing protocol of the SQK-RBK004 kit without DNA size selection (which preserves the plasmid DNA), and the libraries were sequenced using a single R9.4.1/FLO-MIN106 flow cell on a MinION Mk1B sequencer. We performed base calling and demultiplexed the raw data using Guppy v3.4.5 (ONT). The ONT adapters were trimmed using Porechop v0.2.4 (https://github.com/rrwick/Porechop). Quality control of ONT reads was carried out with Nanoplot v1.28.1 (https://github.com/wdecoster/NanoPlot). Hybrid assemblies with the ONT and Illumina data were generated using Unicycler v0.4.8 [26], and the genome sequences were checked for quality using QUAST v5.0.2 [27]. Genome sequences were submitted to the NCBI Prokaryotic Genome Annotation Pipeline (PGAP v4.12) for annotation. The default parameters were used for all software unless otherwise specified. The genome sequences of the seven S. suis serotype 4 strains were deposited in the NCBI GenBank under Bioproject accession number PRJNA691075 with GenBank accession numbers shown in Table 1.
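The order of the read-processing and assembly steps described above can be summarized as a list of command lines. This is an illustrative sketch only: the file names and the helper function are hypothetical, only basic input/output options are shown (the text states that default parameters were used), and the commands are assembled as argument lists rather than executed.

```python
from typing import Dict, List


def build_hybrid_assembly_commands(sample: str) -> Dict[str, List[str]]:
    """Return the per-step command lines for one strain, in execution order.

    All file names are hypothetical placeholders derived from the sample name.
    """
    r1, r2 = f"{sample}_R1.fastq.gz", f"{sample}_R2.fastq.gz"
    c1, c2 = f"{sample}_R1.clean.fastq.gz", f"{sample}_R2.clean.fastq.gz"
    ont, ont_trimmed = f"{sample}_ont.fastq.gz", f"{sample}_ont.trimmed.fastq.gz"
    asm_dir = f"{sample}_unicycler"
    return {
        # Adapter trimming of Illumina paired-end reads
        "fastp": ["fastp", "-i", r1, "-I", r2, "-o", c1, "-O", c2],
        # Adapter trimming of ONT reads
        "porechop": ["porechop", "-i", ont, "-o", ont_trimmed],
        # Hybrid assembly from short and long reads
        "unicycler": ["unicycler", "-1", c1, "-2", c2, "-l", ont_trimmed, "-o", asm_dir],
        # Assembly quality assessment
        "quast": ["quast.py", f"{asm_dir}/assembly.fasta", "-o", f"{sample}_quast"],
    }
```

Each list could be passed to `subprocess.run(cmd, check=True)` in the order returned, provided the tools are installed and the input files exist.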

Bioinformatics analysis
Antimicrobial resistance genes were detected using ResFinder 4.1 [28]. Plasmid replicons were analysed using PlasmidFinder 2.1 and PLACNETw [29,30]. Sequence type (ST) was confirmed using the PubMLST database (https://pubmlst.org/organisms/streptococcus-suis). GoeBURST was used to analyse STs in CC94 [31]. Minimum core genome (MCG) sequence typing was performed according to a procedure described previously [20]. We used MyDbFinder 2.0 (Center for Genomic Epidemiology; https://cge.food.dtu.dk/services/MyDbFinder/) to screen the genomes of the serotype 4 strains for the presence of 99 virulence-associated genes (VAGs) that have been described as important for S. suis virulence or pathogenesis (S1 Table). The same approach was used to screen the genomes for the presence of two genes (G15: ATP-binding protein and G20: hypothetical protein) specific to human-associated clades (HAC) and pathogenic pathotype markers, including a copper-exporting ATPase 1, a type I restriction-modification system S protein, gene SSU_RS03100 (hypothetical protein), gene SSU_RS09155 (hypothetical protein), and gene SSU-RS09525 (RNA-binding protein) [19,32,33]. Of the 99 VAGs, the presence or absence profiles of 22 VAGs that were described in a previous study [34] were clustered by unweighted average linkage (UPGMA) with the DendroUPGMA program as described elsewhere [25]. A dataset containing 97 curated S. suis CC94 genomes [35] was used in combination with the seven serotype 4 genomes generated in this study to construct the phylogeny of the S. suis CC94 population (S2 Table). The phylogeny of the CC94 strains was determined using a reference genome-based single-nucleotide polymorphism (SNP) strategy with REALPHY [36]. The phylogenetic tree was visualized using iTOL V4 software [37]. S. suis serotype 2 strain P1/7 (accession no. CP003736) was used as the reference genome for SNP analysis.
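UPGMA clustering of gene presence/absence profiles starts from a pairwise distance matrix. The sketch below shows one way such a matrix could be computed; the choice of the Jaccard coefficient is an assumption (the text does not state which similarity coefficient was selected in DendroUPGMA), and the function names and any strain labels used with it are hypothetical.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """1 - |A and B| / |A or B| for two sets of genes that are present."""
    union = a | b
    if not union:
        return 0.0  # two empty profiles are treated as identical
    return 1.0 - len(a & b) / len(union)


def distance_matrix(profiles: Dict[str, Set[str]]) -> Dict[Tuple[str, str], float]:
    """Pairwise distances keyed by (strain_x, strain_y), strains sorted by name."""
    return {
        (x, y): jaccard_distance(profiles[x], profiles[y])
        for x, y in combinations(sorted(profiles), 2)
    }
```

The resulting distances are the input that an average-linkage routine would merge step by step into a dendrogram.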
Pangenome analyses were performed with the anvi'o v7 workflow [38]. This workflow identified gene clusters and single-copy genes in the study genomes, namely three serotype 4 strains (ID34572, ID36054, and TRG6) and two serotype 2 strains, the epidemic strain SC84 (accession no. FM252031) and the highly virulent strain P1/7 [17]. All genomes, in FASTA format, were submitted for pangenome analysis using the 'anvi-run-workflow' script. Genes were annotated using 'anvi-run-ncbi-cogs'. All genomes were added to a new anvi'o genome storage using the 'anvi-gen-genomes-storage' application. Then, the program 'anvi-pan-genome' was used to run pangenomic analysis on all the stored genomes using NCBI's blastp tool (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastp&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome). We used 'anvi-import-misc-data' to import additional metadata and 'anvi-compute-genome-similarity' to compute the average nucleotide identity (ANI) using the pyANI tool (https://github.com/widdowquinn/pyani). The pangenome was visualized in anvi'o using the 'anvi-display-pan' application. The whole pangenome was divided into core and accessory bins based on gene cluster frequency. The UpSetR plot was generated using the UpSetR package [39] in R to visualize gene overlaps across bacterial strains. Specifically, gene lists from the pangenome results were prepared and input into the UpSetR package to generate the plots.
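The binning and overlap counting described above can be illustrated with a short sketch. It assumes that "core" means a gene cluster present in every genome and "accessory" means everything else; the exact frequency thresholds used in anvi'o are not given in the text, and the function names and cluster identifiers are hypothetical.

```python
from collections import Counter
from typing import Dict, FrozenSet, Iterable, List, Set


def bin_gene_clusters(clusters: Dict[str, Set[str]],
                      genomes: Iterable[str]) -> Dict[str, List[str]]:
    """Split gene clusters into core (found in all genomes) and accessory."""
    all_genomes = frozenset(genomes)
    bins: Dict[str, List[str]] = {"core": [], "accessory": []}
    for cluster_id, members in clusters.items():
        key = "core" if frozenset(members) == all_genomes else "accessory"
        bins[key].append(cluster_id)
    return bins


def exclusive_intersections(clusters: Dict[str, Set[str]]) -> Counter:
    """Count clusters per exact genome combination.

    These counts correspond to the bar heights of an UpSet plot: each
    cluster is counted once, under the exact set of genomes that carry it.
    """
    return Counter(frozenset(members) for members in clusters.values())
```

Counting by exact genome combination, rather than by pairwise overlap, is what allows statements such as "genes unique to one strain" or "genes shared by two strains but absent from the third" to be read directly from the plot.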

Cell cytotoxicity assays
A human lung adenocarcinoma cell line (A549) and a human cervical cancer cell line (HeLa) were used to determine the cytotoxicity of four selected S. suis serotype 4 CC94 strains, including three ST94 strains (ID36054, ID34693, TRG6) and one ST1689 strain (ID34572). These two cell lines had previously been used to study interactions with S. suis serotype 2 strains [40]. These cell lines were purchased from the American Type Culture Collection (ATCC, MD, USA). Three ST94 strains were selected based on their representation of strains isolated from humans, diseased pigs and asymptomatic pigs, whereas the remaining strain was a representative ST1689 strain from asymptomatic pigs (Table 1). The serotype 2 ST1 strain P1/7 was used as a control. The A549 and HeLa cells were cultured in RPMI1640 (Gibco; Thermo Fisher Scientific) and DMEM (Gibco; Thermo Fisher Scientific), respectively, and supplemented with 10% foetal bovine serum, 100 U/ml of penicillin (Gibco; Thermo Fisher Scientific) and 100 mg/ml streptomycin (Gibco; Thermo Fisher Scientific). They were incubated at 37 °C with 5% CO₂. All four strains of S. suis were cultured overnight on sheep blood agar at 37 °C with 5% CO₂. S. suis inoculum was prepared in RPMI1640 or DMEM depending on cell types at concentrations of 1 × 10³, 1 × 10⁴, 1 × 10⁵ and 1 × 10⁶ CFU/ml. The human epithelial cells were infected with the S. suis P1/7 control strain and the S. suis serotype 4 strains at four concentrations for 2, 4, or 18 h, and subsequently, the effect of S. suis infection was determined using the CCK-8 assay (Merck, Darmstadt, Germany) according to the manufacturer's instructions. This assay was performed in at least triplicate.
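Colorimetric viability assays such as CCK-8 are typically reported as a percentage relative to uninfected cells. The following sketch shows that calculation under stated assumptions: the formula (blank-corrected absorbance of infected wells divided by that of uninfected wells) is the conventional one for this kind of assay and is not quoted from the text, and the function and argument names are hypothetical.

```python
from statistics import mean
from typing import Sequence


def percent_viability(infected: Sequence[float],
                      uninfected: Sequence[float],
                      blank: Sequence[float]) -> float:
    """Viability (%) = (A_infected - A_blank) / (A_uninfected - A_blank) * 100.

    Each argument holds replicate absorbance readings (e.g. triplicate wells);
    replicates are averaged before the ratio is taken.
    """
    background = mean(blank)
    reference = mean(uninfected) - background
    if reference <= 0:
        raise ValueError("uninfected control must exceed the blank reading")
    return 100.0 * (mean(infected) - background) / reference
```

A value near 100 indicates no loss of viability relative to uninfected cells, whereas lower values indicate cytotoxicity.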

MLST and MCG analysis
Hybrid Nanopore-Illumina assemblies allowed us to obtain high-quality genomes (1 or 2 contigs for two and one strains, respectively), while genome assemblies for the four strains sequenced by Illumina had only between 70 and 138 contigs. Comprehensive statistics for genome assemblies are provided in Table 1. Among the seven strains in the current study, only strain TRG6 from a diseased pig contained a plasmid (6,890 bp); however, the replicon type could not be identified by either PlasmidFinder 2.1 or PLACNETw (Table 1). This plasmid carried seven hypothetical protein genes, one vanZ family protein gene, and an unidentified replication protein gene.
Genome-based MLST analysis of the seven S. suis serotype 4 strains confirmed that four strains were ST94 (ID36054, TRG6, ID34693 and ID34704) and three strains (ID34572, ID34545, ID34553) were ST1689; both STs belong to CC94 (Table 1). As shown in Fig 1, CC94 comprises 91 STs based on sequences available in the PubMLST database as of Dec 9, 2022. ST1689 is a single-locus variant of ST94, differing only in the dpr allele. A previous study demonstrated that CC1, CC28, CC94, and CC104 strains are associated with a pathogenic pathotype [21]. In CC94, STs 94, 108, and 977 were considered a "pathogenic pathotype" [21].
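The relationship between two sequence types that differ at a single locus can be checked by comparing their allelic profiles across the seven loci of the S. suis MLST scheme. The sketch below is illustrative only: the function names are hypothetical, and any allele numbers supplied to it are placeholders rather than the actual PubMLST profiles of ST94 or ST1689.

```python
from typing import Dict, List

# Housekeeping loci of the S. suis MLST scheme
LOCI = ("aroA", "cpn60", "dpr", "gki", "mutS", "recA", "thrA")


def differing_loci(profile_a: Dict[str, int], profile_b: Dict[str, int]) -> List[str]:
    """Return the loci at which two allelic profiles carry different alleles."""
    return [locus for locus in LOCI if profile_a[locus] != profile_b[locus]]


def is_single_locus_variant(profile_a: Dict[str, int], profile_b: Dict[str, int]) -> bool:
    """True when the two profiles differ at exactly one of the seven loci."""
    return len(differing_loci(profile_a, profile_b)) == 1
```

Grouping algorithms such as goeBURST build on this comparison, linking STs that are single-locus variants of one another into a clonal complex.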
Analysis of MCG groups showed that all seven strains belonged to MCG group 3 (Table 1). MCG group 3 has been shown to include isolates from either diseased or clinically healthy pigs, which possess a higher number of virulence-associated genes than MCG groups 4 through 7, suggesting an increased potential for virulence [20].
As shown in Fig 2, the genetic region carrying the tet(O) and erm(B) genes was similar among serotype 4 strains, the only difference being that a hypothetical protein gene downstream of erm(B) was present in strains ID34545, ID34553, ID34693, and ID34704 but absent from the three other strains. The organization of the tet(O) and erm(B) genes is different in serotypes 4, 21, and 24 [25,56]. The co-occurrence of erm(B) and tet(O) has been reported in 69.06% (221/320) of S. suis from China and was also frequently detected in isolates from other countries [49,54-59].
It has been shown that some antimicrobial resistance genes are located on genomic islands integrated at the rpsI locus, examples of which are ant (9) (6), which confer resistance to macrolides, lincosamides, and aminoglycosides [48]. Strains ID34572, ID34545, and ID34553 possessed the genes ant (9)
Comparison with other studies suggests that putative virulence-associated genes are distributed differently among S. suis lineages, implying that the genes correlated with virulence may differ between lineages. Based on our analysis, serotype 4 CC94 strains carried many VAGs and might therefore be considered potentially virulent, concordant with a previous study [21].

Pathogenic pathotype determinants
Two previous studies have described markers of pathogenic pathotypes (or markers of disease-associated strains) [32,33]. These are a copper-exporting ATPase 1, a type I restriction-modification system S protein, the gene SSU_RS03100 (hypothetical protein), the gene SSU_RS09155 (hypothetical protein), and the gene SSU-RS09525 (RNA-binding protein) [32,33]. Conversely, a putative sugar ATP-binding cassette transporter gene is a marker for non-pathogenic pathotype strains [32]. The CC94 serotype 4 strains in this study possessed all five pathogenic marker genes but lacked the non-pathogenic pathotype marker genes (S1 Table). This suggests that they may have virulence potential, which is consistent with the fact that two of the strains were isolated from an ill human patient and a diseased pig. The remaining five strains were isolated from clinically healthy pigs, which may indicate that clinically healthy pigs act as reservoirs of the pathogenic S. suis pathotype. A previous study demonstrated that S. suis strains with genotypes identical to pathogenic human strains were detected in asymptomatic healthy pigs [12]. In addition, CC94 was associated with pathogenic strains [21]. However, we cannot rule out that these pathogenic pathotype markers may fail to truly differentiate between nonpathogenic and pathogenic pathotypes, as is discussed elsewhere [64]. Further studies aimed at evaluating these five pathogenic pathotype markers in S. suis strains of different serotypes, sequence types, isolation sources, and geographic regions are required to better understand their usefulness as virulence predictors.
Dong et al. proposed a panel of 25 marker genes as being strongly associated with human infections [19]. Among these, two genes (G15: ATP-binding protein and G20: hypothetical protein) were selected to be representative of the human-associated clade described in a previous study [19]. Analysis of these two HAC marker genes in our CC94 serotype 4 strains revealed that they were absent from the genomes of the strains, even though one of the strains was isolated from an ill human patient. It is possible that these two HAC marker genes are present only in restricted S. suis populations or in strains of some specific CCs. Therefore, more extensive analysis of these marker genes should be conducted to assess their capacity to predict whether a strain can cause human infections.

Genomic comparison
We next compared the complete genomes of three serotype 4 strains, namely, strains ID36054 (ST94), TRG6 (ST94), and ID34572 (ST1689) (Fig 5A and 5B). A total of 2,018 coding sequences were found in these three strains. Strain TRG6 had 10 unique genes, while seven and six genes were unique to strains ID34572 and ID36054, respectively (Table 2; Fig 5B). TRG6 and ID34572 shared 65 genes not present in ID36054, while nine genes were shared between TRG6 and ID36054 but not ID34572. Table 2 shows a summary of the unique genes found in each of the three strains.
We compared these three serotype 4 complete genomes with the representative genomes of serotype 2 epidemic strain SC84 (ST7) and serotype 2 highly virulent strain P1/7 (ST1). As shown in Fig 5C and 5D, the pangenomes of three serotype 4 and two serotype 2 strains revealed gene homologues and non-homologues. An ANI heatmap tree showed two clusters of gene contents (homologues and non-homologues) in serotype 4 and 2 strains (Fig 5C). A total of 1,694 genes were common between serotype 4 and serotype 2 strains, whereas 271 genes were present in the genomes of the serotype 4 strains but absent from the genomes of the representative virulent strains of serotype 2 (Fig 5D). Only 138 genes were unique to the serotype 2 strains, whereas 36 genes were shared between serotype 2 strain SC84 (ST7) and all three serotype 4 strains (Fig 5D). ID34572 and TRG6 each had seven unique genes, whereas ID36054 had six (Fig 5D). Of note, genes encoding a putative membrane-associated protein and an aminoglycoside-6-adenylyltransferase were found in strain TRG6 and the two serotype 2 strains. Among the unique sequences present in the serotype 4 strains, we identified genes encoding antibiotic-resistance proteins (lincosamide, aminoglycoside, streptothricin), DNA processing proteins (single-strand binding protein, recombinase, invertase, transposases), domain-containing proteins (WYL, HTH, GNAT, RibD) and hypothetical proteins (Table 2).

Cell cytotoxicity
To begin to evaluate the virulence of serotype 4 CC94 strains, we chose three ST94 (TRG6, ID36054, ID34693) and one ST1689 (ID34572) strains, as they represented clinical cases from humans and pigs, as well as isolates from clinically healthy pigs. As shown in Fig 6, the A549 cell line showed higher susceptibility to S. suis serotype 4 and the control P1/7 strains than the HeLa cell line. At an infective dose of 1 × 10⁶ bacteria, both A549 and HeLa cells rapidly lost viability 4 h post-infection. Interestingly, cells infected with the human serotype 4 strain (ID36054) showed a decrease in cell viability comparable to that observed for the highly virulent control P1/7 strain, which contrasted with that of strain TRG6 (from a diseased pig). The ST1689 strain (ID34572) also induced rapid cell viability loss.
Because all tested strains in the current study carried the suilysin gene (sly), suilysin likely contributed to the observed cytotoxicity; the cytotoxic activity of suilysin has been described elsewhere [65,66]. Several studies have shown that suilysin plays a role in pathogenesis; for example, suilysin induces membrane ruffling and uptake by epithelial cells by manipulating the host cell cytoskeleton [67], induces platelet aggregation [68], induces platelet-neutrophil complex formation [69], enhances blood-brain barrier permeability by releasing arachidonic acid in brain microvascular endothelial cells [70], stimulates the release of heparin-binding protein from polymorphonuclear neutrophils [71], and induces TNFα release by monocytes [72].
The virulence of S. suis serotype 2 strains has been extensively characterized using both in vitro and in vivo models of infection; however, virulence studies of non-serotype 2 strains have only recently begun to be conducted. In one example, a S. suis serotype 31 strain (strain 11LB5) induced neurological symptoms in mice similar to those caused by a serotype 2 strain, while a S. suis serotype 28 (strain 11313) was non-virulent in mouse infection models [55]. An additional study showed that seven out of 47 S. suis serotype 31 strains isolated from clinically healthy pigs were also pathogenic in a zebrafish infection model [73]. S. suis serotype 8 strains were also proven to be virulent in mice and zebrafish [74]. A serotype 7-ST29 strain showed high survival in porcine blood that was obtained after the weaning of pigs and the strain caused meningitis and arthritis in an experimental infection of weaning piglets [75]. In addition, S. suis serotype 7 strains, including an MCG-3 strain isolated from a human patient, were lethal to experimentally infected mice [6]. Although we did not perform a full characterization of the virulence of serotype 4 CC94 strains, our cytotoxicity data together with our genetic findings suggest that some serotype 4 CC94 strains may be virulent. Further virulence studies, including in in vivo infection models, should be conducted to test this hypothesis.

Table 2. Presentation of unique genes by comparison among three Streptococcus suis serotype 4 clonal complex 94 strains and two Streptococcus suis serotype 2 strains, SC84 (epidemic strain) and P1/7 (highly virulent strain).

Conclusion
Genomic exploration and cytotoxicity tests of our S. suis serotype 4 CC94 strains isolated from patients, diseased pigs, and clinically healthy pigs revealed that they could be potentially virulent. They carried many virulence-associated genes often found in virulent serotype 2 strains and were cytotoxic to two cell lines. In addition to their potential pathogenicity, serotype 4 CC94 strains in the current study are carriers of the antimicrobial resistance genes tet(O) and ermB, which confer resistance to tetracycline, macrolides, and lincosamides.
Supporting information S1